50 research outputs found

    Automated Gaze-Based Mind Wandering Detection during Computerized Learning in Classrooms

    Get PDF
    We investigate the use of commercial off-the-shelf (COTS) eye-trackers to automatically detect mind wandering—a phenomenon involving a shift in attention from task-related to task-unrelated thoughts—during computerized learning. Study 1 (N = 135 high-school students) tested the feasibility of COTS eye tracking while students learn biology with an intelligent tutoring system called GuruTutor in their classroom. We could successfully track eye gaze in 75% (both eyes tracked) and 95% (one eye tracked) of the cases for 85% of the sessions where gaze was successfully recorded. In Study 2, we used this data to build automated student-independent detectors of mind wandering, obtaining accuracies (mind wandering F1 = 0.59) substantially better than chance (F1 = 0.24). Study 3 investigated context-generalizability of mind wandering detectors, finding that models trained on data collected in a controlled laboratory more successfully generalized to the classroom than the reverse. Study 4 investigated gaze- and video- based mind wandering detection, finding that gaze-based detection was superior and multimodal detection yielded an improvement in limited circumstances. We tested live mind wandering detection on a new sample of 39 students in Study 5 and found that detection accuracy (mind wandering F1 = 0.40) was considerably above chance (F1 = 0.24), albeit lower than offline detection accuracy from Study 1 (F1 = 0.59), a finding attributable to handling of missing data. We discuss our next steps towards developing gaze-based attention-aware learning technologies to increase engagement and learning by combating mind wandering in classroom contexts

    Improving fairness in machine learning systems: What do industry practitioners need?

    Full text link
    The potential for machine learning (ML) systems to amplify social inequities and unfairness is receiving increasing popular and academic attention. A surge of recent work has focused on the development of algorithmic tools to assess and mitigate such unfairness. If these tools are to have a positive impact on industry practice, however, it is crucial that their design be informed by an understanding of real-world needs. Through 35 semi-structured interviews and an anonymous survey of 267 ML practitioners, we conduct the first systematic investigation of commercial product teams' challenges and needs for support in developing fairer ML systems. We identify areas of alignment and disconnect between the challenges faced by industry practitioners and solutions proposed in the fair ML research literature. Based on these findings, we highlight directions for future ML and HCI research that will better address industry practitioners' needs.Comment: To appear in the 2019 ACM CHI Conference on Human Factors in Computing Systems (CHI 2019

    Large expert-curated database for benchmarking document similarity detection in biomedical literature search

    Get PDF
    Document recommendation systems for locating relevant literature have mostly relied on methods developed a decade ago. This is largely due to the lack of a large offline gold-standard benchmark of relevant documents that cover a variety of research fields such that newly developed literature search techniques can be compared, improved and translated into practice. To overcome this bottleneck, we have established the RElevant LIterature SearcH consortium consisting of more than 1500 scientists from 84 countries, who have collectively annotated the relevance of over 180 000 PubMed-listed articles with regard to their respective seed (input) article/s. The majority of annotations were contributed by highly experienced, original authors of the seed articles. The collected data cover 76% of all unique PubMed Medical Subject Headings descriptors. No systematic biases were observed across different experience levels, research fields or time spent on annotations. More importantly, annotations of the same document pairs contributed by different scientists were highly concordant. We further show that the three representative baseline methods used to generate recommended articles for evaluation (Okapi Best Matching 25, Term Frequency-Inverse Document Frequency and PubMed Related Articles) had similar overall performances. Additionally, we found that these methods each tend to produce distinct collections of recommended articles, suggesting that a hybrid method may be required to completely capture all relevant articles. The established database server located at https://relishdb.ict.griffith.edu.au is freely available for the downloading of annotation data and the blind testing of new methods. We expect that this benchmark will be useful for stimulating the development of new powerful techniques for title and title/abstract-based search engines for relevant articles in biomedical research.Peer reviewe

    Multimodal Affect Detection in the Wild: Accuracy, Availability, and Generalizability

    No full text
    ABSTRACT Affect detection is an important component of computerized learning environments that adapt the interface and materials to students' affect. This paper proposes a plan for developing and testing multimodal affect detectors that generalize across differences in data that are likely to occur in practical applications (e.g., time, demographic variables). Facial features and interaction log features are considered as modalities for affect detection in this scenario, each with their own advantages. Results are presented for completed work evaluating the accuracy of individual modality faceand interaction-based detectors, accuracy and availability of a multimodal combination of these modalities, and initial steps toward generalization of face-based detectors. Additional data collection needed for cross-culture generalization testing is also completed. Challenges and possible solutions for proposed cross-cultural generalization testing of multimodal detectors are also discussed

    What emotions do novices experience during their first computer programming learning session

    No full text
    Abstract. We conducted a study to track the emotions, their behavioral correlates, and relationship with performance when novice programmers learned the basics of computer programming in the Python language. Twenty-nine participants without prior programming experience completed the study, which consisted of a 25 minute scaffolding phase (with explanations and hints) and a 15 minute fadeout phase (no explanations or hints) with a computerized learning environment. Emotional states were tracked via retrospective self-reports in which learners viewed videos of their faces and computer screens recorded during the learning session and made judgments about their emotions at approximately 100 points. The results indicated that flow/engaged (23%), confusion (22%), frustration (14%), and boredom (12%) were the major emotions students experienced, while curiosity, happiness, anxiety, surprise, anger, disgust, fear, and sadness were comparatively rare. The emotions varied as a function of instructional scaffolds and were systematically linked to different student behaviors (idling, constructing code, running code). Boredom, flow/engaged, and confusion were also correlated with performance outcomes. Implications of our findings for affect-sensitive learning interventions are discussed

    Mind wandering during learning with an intelligent tutoring system

    No full text
    Mind wandering (zoning out) can be detrimental to learning outcomes in a host of educational activities, from reading to watching video lectures, yet it has received little attention in the field of intelligent tutoring systems (ITS). In the current study, participants self-reported mind wandering during a learning session with Guru, a dialogue-based ITS for biology. On average, participants interacted with Guru for 22 minutes and reported an average of 11.5 instances of mind wandering, or one instance every two minutes. The frequency of mind wandering was compared across five different phases of Guru (Common-Ground-Building Instruction, Intermittent Summary, Concept Map, Scaffolded Dialogue, and Cloze task), each requiring different learning strategies. The rate of mind wandering per minute was highest during the Common-Ground-Building Instruction and Scaffolded Dialogue phases of Guru. Importantly, there was significant negative correlation between mind wandering and learning, highlighting the need to address this phenomena during learning with ITSs

    Accuracy vs. Availability Heuristic in Multimodal Affect Detection in the Wild

    No full text
    ABSTRACT This paper discusses multimodal affect detection from a fusion of facial expressions and interaction features derived from students' interactions with an educational game in the noisy real-world context of a computer-enabled classroom. Log data of students' interactions with the game and face videos from 133 students were recorded in a computer-enabled classroom over a two day period. Human observers live annotated learning-centered affective states such as engagement, confusion, and frustration. The face-only detectors were more accurate than interaction-only detectors. Multimodal affect detectors did not show any substantial improvement in accuracy over the face-only detectors. However, the face-only detectors were only applicable to 65% of the cases due to face registration errors caused by excessive movement, occlusion, poor lighting, and other factors. Multimodal fusion techniques were able to improve the applicability of detectors to 98% of cases without sacrificing classification accuracy. Balancing the accuracy vs. applicability tradeoff appears to be an important feature of multimodal affect detection
    corecore